## Train by Reconnect: Decoupling Locations of Weights from their Values

Supplementary materials for *Train by Reconnect: Decoupling Locations of Weights from their Values*.

## Dependencies
Code in this repository requires:
- Python 3.6 or higher
- TensorFlow v2.1.0 or higher
- the packages listed in [requirements.txt](./requirements.txt)

## Table of Contents
This repository contains:
- **[Appendix.pdf](./Appendix.pdf)**: Appendix with additional details and results for the paper. Because it adds several new citations, it opens with a References section before the main content.
- **train_by_reconnect**: Minimal code for reproducing the main results in the paper. The code is commented and accompanied by working examples in [notebooks](./notebooks).
    - [LaPerm.py](./train_by_reconnect/LaPerm.py)
        - `LaPerm`: [TensorFlow](https://www.tensorflow.org/) implementation of LaPerm (Section 4).
        - `LaPermTrainLoop`: A custom train loop that applies LaPerm to a [tensorflow.keras.Model](https://www.tensorflow.org/api_docs/python/tf/keras/Model).
    - [weight_utils.py](./train_by_reconnect/weight_utils.py)
        - `agnosticize`: Replace the weights in a model with a single shared value. (Section 5.5)
        - `random_prune`: Randomly prune the model. (Section 5.4)
    - [viz_utils.py](./train_by_reconnect/viz_utils.py)
        - `Profiler`: Plot weight profiles for a given model. (Section 2)
        - `PermutationTracer`: Visualize and trace how the locations of weights have changed.
- **notebooks**: [Jupyter notebooks](https://jupyter.org/) containing the model definitions and experiment configurations for reproducing or extending the experiments (training + evaluation). Detailed instructions can be found inside the notebooks.
    - [`Conv2.ipynb`](./notebooks/Conv2.ipynb), [`Conv4.ipynb`](./notebooks/Conv4.ipynb), [`Conv13.ipynb`](./notebooks/Conv13.ipynb), [`Conv7.ipynb`](./notebooks/Conv7.ipynb), [`ResNet50.ipynb`](./notebooks/ResNet50.ipynb): For experiments mentioned in Sections 5.1 to 5.4.
    - [`F1_and_F2.ipynb`](./notebooks/F1_and_F2.ipynb): For experiments mentioned in Section 5.5.
    - [`Weight_profiles.ipynb`](./notebooks/Weight_profiles.ipynb): For visualizations mentioned in Section 2.
- **pretrained**: Pre-trained weights for the main results in the paper. (For detailed model definitions, please refer to `notebooks`.)
    | Models     | Top-1 | *p%* | *k* | Dataset | Section | Distribution| Note|
    | ---------- |:-----:| ----:| ---:| ------------:| -------:| -----------:| :-----------:| 
    | [Conv7](./pretrained/Conv7.h5)      | 99.72%| 0%   | 1200|   MNIST |     5.1 | He Uniform  |
    | [Conv4](./pretrained/Conv4.h5)      | 89.17%| 0%   | 1000| CIFAR-10|5.2, 5.4 | He Uniform  |
    | [Conv4](./pretrained/Conv4_70.h5)      | 87.61%| 70%   | 2000| CIFAR-10|5.2, 5.4 | He Uniform  |
    | [Conv13](./pretrained/Conv13.h5)     | 92.21%| 0%   | 1000| CIFAR-10|5.2, 5.4 | He Uniform  |
    | [ResNet50](./pretrained/resnet50_0.h5)   | 92.53%| 0%   |  400| CIFAR-10|     5.4 | He Uniform  |
    | [ResNet50](./pretrained/resnet50_30.h5)   | 92.32%| 30%  |  800| CIFAR-10|     5.4 | He Uniform  |
    | [ResNet50](./pretrained/resnet50_50.h5)   | 92.02%| 50%  |  800| CIFAR-10|     5.4 | He Uniform  |
    | [ResNet50](./pretrained/resnet50_70.h5)   | 90.97%| 70%  |  800| CIFAR-10|     5.4 | He Uniform  |
    | [F1](./pretrained/F1.h5)         | 85.46%| 40%  |  250|   MNIST |     5.5 | Shared 0.08 |
    | [F2](./pretrained/F2.h5)         | 78.14%| 92%  |  250|   MNIST |     5.5 | Shared 0.03 |
    > Some weights, e.g., Conv2, could not be included because of the 100 MB size limit.
    - ***p%***: Percentage of weights that are randomly pruned before training; e.g., *p*=10% means that 90% of the weights remain non-zero. (Section 5.4)
    - ***k***: Sync period used in the experiment. (Section 4)
    - Distribution: The random distribution that the values of the trained weights follow.
        - He Uniform: [He et al. 2015](https://arxiv.org/abs/1502.01852)
        - Shared 0.08: the weights are sampled from the set {0, 0.08}.
        - Shared 0.03: the weights are sampled from the set {0, 0.03}.
    - Datasets: [MNIST](http://yann.lecun.com/exdb/mnist/), [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html).
- **networks**: Visual descriptions of the architectures used in Section 5.
- **weight_profiles**: Weight profiles of selected models pre-trained on ImageNet.
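
Two ideas recur in the list above: randomly pruning *p%* of the weights before training, and changing where weights sit without changing their values. The NumPy sketch below illustrates both in isolation. It is not the repository's `random_prune` or `LaPerm` code: the function names, signatures, and the choice to operate on a single array or vector are assumptions made for illustration, and the actual implementations may differ (for example, in whether pruning is applied per layer).

```python
import numpy as np


def random_prune_sketch(weights, p, rng):
    """Zero out a randomly chosen fraction p (between 0 and 1) of the entries.

    Hypothetical helper for illustration; not the repository's `random_prune`.
    """
    flat = weights.ravel().copy()              # copy so the input is not modified
    n_pruned = int(round(p * flat.size))
    # Pick distinct positions to set to zero.
    idx = rng.choice(flat.size, size=n_pruned, replace=False)
    flat[idx] = 0.0
    return flat.reshape(weights.shape)


def permute_to_match_rank(values, reference):
    """Rearrange 1-D `values` so that their rank order matches `reference`.

    The set of values is unchanged; only their locations move.
    Hypothetical helper for illustration; not the repository's `LaPerm`.
    """
    order = np.argsort(reference, kind="stable")  # positions of reference, smallest first
    out = np.empty_like(values)
    out[order] = np.sort(values)                  # i-th smallest value goes where reference is i-th smallest
    return out


rng = np.random.default_rng(0)
w0 = rng.uniform(-1.0, 1.0, size=(4, 6))
pruned = random_prune_sketch(w0, p=0.5, rng=rng)  # half of the 24 entries become zero

values = np.array([0.3, -0.1, 0.2])
reference = np.array([5.0, 7.0, 1.0])
synced = permute_to_match_rank(values, reference)
print(synced)  # the same three values, reordered to follow the ranking of `reference`
```

Sorting `synced` and `values` gives identical arrays, which is the sense in which locations are decoupled from values here.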

## Loading the pre-trained weights
1. Locate the Jupyter notebook that corresponds to the weights in [notebooks](./notebooks). For example, for the weights file `Conv7.h5`, see [Conv7.ipynb](./notebooks/Conv7.ipynb) for the model definition and experiment configuration.
2. Define the `model` as demonstrated in the notebook.
3. Load the weights into `model`:
    ```python
    model.load_weights('../pretrained/Conv7.h5')
    ```
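
After loading, one quick sanity check is to compare the share of zero-valued entries with the *p%* column of the table. The sketch below assumes the weights are available as a list of NumPy arrays, which is what `model.get_weights()` returns for a Keras model; `zero_fraction` is a hypothetical helper, not part of this repository. Because that list can also include parameters that were never pruned (such as biases), the result is only expected to be close to *p%*, not to match it exactly.

```python
import numpy as np


def zero_fraction(arrays):
    """Return the fraction of entries equal to zero across a list of arrays."""
    total = sum(a.size for a in arrays)
    zeros = sum(a.size - np.count_nonzero(a) for a in arrays)
    return zeros / total if total else 0.0


# With a loaded Keras model, the arrays would come from `model.get_weights()`.
# A small stand-in list is used here so the snippet runs on its own.
example = [np.array([[0.0, 0.5], [0.0, -0.2]]), np.array([0.1, 0.0])]
print(f"{zero_fraction(example):.2%} of entries are zero")
```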